
An aspect query language model based on query decomposition and high-order contextual term associations



Abstract

In information retrieval (IR) research, increasing attention has been paid to optimizing a query language model by detecting and estimating the dependencies between the query and the observed terms occurring in the selected relevance feedback documents. In this paper, we propose a novel Aspect Language Modeling framework featuring term association acquisition, document segmentation, query decomposition, and an Aspect Model (AM) for parameter optimization. Through the proposed framework, we advance the theory and practice of applying high-order and context-sensitive term relationships to IR. We first decompose a query into subsets of query terms. Then we segment the relevance feedback documents into chunks using multiple sliding windows. Finally, we discover the high-order term associations, that is, the terms in these chunks that have a high degree of association with the subsets of the query. In this process, we combine the AM with Association Rule (AR) mining. In our approach, the AM not only treats the subsets of a query as "hidden" states and estimates their prior distributions, but also evaluates the dependencies between the subsets of a query and the observed terms extracted from the chunks of feedback documents. The AR mining provides a reasonable initial estimate of the high-order term associations by discovering association rules from the document chunks. Experimental results on various TREC collections verify the effectiveness of our approach, which significantly outperforms a baseline language model and two state-of-the-art query language models, namely the Relevance Model and the Information Flow model.
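
The three preprocessing steps named in the abstract (query decomposition, sliding-window segmentation, and an association-rule estimate of subset-to-term associations) might be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the abstract gives no window sizes, strides, or thresholds, so every function name, parameter, and default below is an assumption, and the rule confidence used here is the standard co-occurrence ratio rather than anything confirmed by the paper. The Aspect Model optimization step is not shown.

```python
from collections import Counter, defaultdict
from itertools import combinations


def decompose_query(query_terms):
    """Return every non-empty subset of the query terms as a sorted tuple.

    Hypothetical helper: the abstract only says the query is decomposed
    into subsets of query terms.
    """
    terms = sorted(set(query_terms))
    return [subset
            for size in range(1, len(terms) + 1)
            for subset in combinations(terms, size)]


def sliding_chunks(tokens, window_sizes, step=None):
    """Cut a token list into overlapping chunks, one pass per window size.

    `window_sizes` and `step` are assumed parameters; the default stride
    of half a window is an arbitrary choice for this sketch.
    """
    chunks = []
    for size in window_sizes:
        stride = step or max(1, size // 2)
        last_start = max(len(tokens) - size, 0)
        for start in range(0, last_start + 1, stride):
            chunks.append(tokens[start:start + size])
    return chunks


def association_confidence(chunks, subsets):
    """Estimate confidence(subset -> term) over the chunks.

    confidence = (#chunks containing the subset and the term)
                 / (#chunks containing the subset)
    """
    subset_count = Counter()
    joint_count = defaultdict(Counter)
    for chunk in chunks:
        present = set(chunk)
        for subset in subsets:
            if present.issuperset(subset):
                subset_count[subset] += 1
                for term in present.difference(subset):
                    joint_count[subset][term] += 1
    return {
        subset: {term: n / subset_count[subset]
                 for term, n in joint_count[subset].items()}
        for subset in subset_count
    }


if __name__ == "__main__":
    # Toy feedback document and query, invented for illustration only.
    query = ["space", "program"]
    feedback_doc = "the space program launched a satellite into orbit".split()
    subsets = decompose_query(query)
    chunks = sliding_chunks(feedback_doc, window_sizes=[4, 6], step=1)
    for subset, terms in association_confidence(chunks, subsets).items():
        print(subset, terms)
```

In a pipeline of the kind the abstract describes, confidences like these would only serve as initial estimates that the Aspect Model then refines; enumerating all subsets is exponential in query length, which is tolerable for short queries but would need pruning otherwise.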
